Use SavedModel instead of HDF5 format, fix dewarping #89
- move model loading into `setup` in constructor context
- allow directories as models (TF SavedModel format), too
- use correct pageId
- simplify and polish
use custom dataset class for in-memory PIL.Image passing instead of file-based repurposed `AlignedDataset` (since this is faster and more reliable: OCR-D does not guarantee us a `.filename` for derived images; also, this no longer creates temporary files in the input fileGrp)
after decoding, convert tensor to array with due respect for proper channel and dynamic range coding (instead of ad-hoc conversion); then resize while still in RGB and re-binarize (instead of ad-hoc binarization followed by resizing in binary)
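The decoding step in the commit message above can be illustrated with a minimal pure-Python sketch. It assumes a channel-first (CHW) tensor with values in [-1, 1], which is an assumption about the generator's output coding and not confirmed in this thread. The function names and the threshold are hypothetical, the resize step (which would sit between the two functions, while still in RGB) is left out, and real code would use NumPy and PIL rather than nested lists.

```python
def tensor_to_rgb(tensor, lo=-1.0, hi=1.0):
    """Convert a CHW nested list with values in [lo, hi] to an HWC list of 0..255 ints."""
    channels, height, width = len(tensor), len(tensor[0]), len(tensor[0][0])

    def scale(value):
        # Clip to the assumed dynamic range, then map linearly onto 0..255.
        value = min(max(value, lo), hi)
        return int(round((value - lo) / (hi - lo) * 255))

    # Reorder from channel-first (CHW) to channel-last (HWC).
    return [[[scale(tensor[c][y][x]) for c in range(channels)]
             for x in range(width)]
            for y in range(height)]


def binarize(rgb, threshold=128):
    """Re-binarize an HWC image: 255 where the channel mean reaches the threshold, else 0."""
    return [[255 if sum(pixel) / len(pixel) >= threshold else 0 for pixel in row]
            for row in rgb]


# Toy input: 3 channels, 1 row, 2 columns (one dark pixel, one bright pixel).
example = [[[-1.0, 1.0]], [[-1.0, 1.0]], [[-1.0, 2.0]]]
rgb = tensor_to_rgb(example)   # out-of-range 2.0 is clipped to 1.0
mask = binarize(rgb)
```

Binarizing only after the RGB image has been brought to its final size avoids interpolating an already binary image, which is the point the commit message makes.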
- rebase on pix2pixHD#293 (CPU-only option, Torch>=1.0, less verbose, arg passing)
- pass args to pix2pixHD directly (instead of `sys.argv` hijacking)
- no unnecessary verbosity (and only through loggers)
- move model loading into startup context via `setup` fn
- rename params:
  * `imgresize` → `resize_mode`
  * `resizeHeight` → `resize_height`
  * `resizeWidth` → `resize_width`
- add proper documentation
- fix region-level results
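Passing arguments directly rather than hijacking the process-wide argument list could look roughly like the following sketch. The parser and its options (`--mode`, `--height`, `--width`) are hypothetical stand-ins, not pix2pixHD's actual option set; the only thing shown is that `argparse` accepts an explicit argument list, so `sys.argv` never needs to be modified.

```python
import argparse
import sys


def build_parser():
    # Hypothetical stand-in for the options parser of the wrapped tool.
    parser = argparse.ArgumentParser(prog="wrapped-tool")
    parser.add_argument("--mode", default="none")
    parser.add_argument("--height", type=int, default=1024)
    parser.add_argument("--width", type=int, default=1024)
    return parser


def options_from_params(params):
    """Build an options namespace from a parameter dict without touching sys.argv."""
    args = []
    for key, value in params.items():
        args.extend(["--" + key, str(value)])
    # parse_args() with an explicit list ignores sys.argv entirely.
    return build_parser().parse_args(args)


argv_before = list(sys.argv)
opts = options_from_params({"mode": "resize", "height": 2048})
```

Because nothing global is mutated, the caller's own command line stays intact and several option sets can coexist in one process.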
(just BIN is not enough / not as good / not realistic)
Now also depends on NVIDIA/pix2pixHD#293, and contains various other fixes, mostly regarding dewarping.

Fixes #34, #35, #40, #60, #61, #72, #73, #77, #87, #88, and probably #42 (see below – with

With better upsampling/re-binarization, the quality of the dewarper has also improved a little. It is obviously not a good idea to downsample in the first place (which is the case with the default

Here are some examples based on the

dewarped with default settings:
dewarped with default settings but on GPU:
dewarped with larger size (less resampling/interpolation):
dewarped with original/full image size:
dewarped on cropped but raw RGB (just to show that the models have not been trained on such data):
Like I said, we still need to upload the new models, and update the resource URLs. (This is the reason the CI still fails.)
On Python 3.8, you get errors trying to load the existing HDF5 models for the Tensorflow processors `tiseg` and `layout-analysis`. However, Tensorflow offers a more stable alternative: SavedModel directories. I have converted the existing models and adapted the code to make them runnable again.
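To accept both kinds of model, a processor has to tell a SavedModel directory from a single HDF5 file. A minimal sketch of such a check follows; the function name `detect_model_format` is hypothetical and this is not necessarily how the code in this PR does it. It relies only on the fact that a SavedModel directory normally contains a `saved_model.pb` file, and on the conventional `.h5`/`.hdf5` extensions.

```python
import os
import tempfile


def detect_model_format(path):
    """Return 'savedmodel' for a SavedModel directory, 'hdf5' for an HDF5 file."""
    # A SavedModel is a directory that normally holds a saved_model.pb protobuf.
    if os.path.isdir(path) and os.path.isfile(os.path.join(path, "saved_model.pb")):
        return "savedmodel"
    # HDF5 models are single files; here they are recognised by extension only.
    if os.path.isfile(path) and path.lower().endswith((".h5", ".hdf5")):
        return "hdf5"
    raise ValueError("unrecognised model path: %s" % path)


# Demonstration on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = os.path.join(tmp, "model")
    os.mkdir(model_dir)
    open(os.path.join(model_dir, "saved_model.pb"), "wb").close()
    h5_file = os.path.join(tmp, "model.h5")
    open(h5_file, "wb").close()
    formats = (detect_model_format(model_dir), detect_model_format(h5_file))
```

A check like this only inspects the file system; whether the model actually loads still depends on the installed Tensorflow version.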
Now, how do we redistribute these? I have uploaded them as tarballs here and here. But really they should go to https://ocr-d-repo.scc.kit.edu/models/dfki as well.
As soon as we get OCR-D/core#800 done, we should then be able to update the resource list in ocrd-tool.json, right?
Another dependency is in the processors using `ocrolib.morph`, i.e. `nlbin` and `textline`: OCR-D/ocropy#2 – @kba, as soon as you have merged and published `ocrd-fork-ocropy==1.4.0a4`, this is ready to go.